U-statistics in stochastic geometry


Raphael Lachieze-Rey and Matthias Reitzner 


Abstract A U-statistic of order $k$ with kernel $f : \mathbb{X}^k \to \mathbb{R}^d$ over a Poisson process $\eta$ is defined in [?] as

\[
\sum_{(x_1,\dots,x_k) \in \eta^k_{\neq}} f(x_1,\dots,x_k)
\]

under appropriate integrability assumptions on $f$. U-statistics play an important role in stochastic geometry since many interesting functionals can be written as U-statistics, like intrinsic volumes of intersection processes, characteristics of random geometric graphs, volumes of random simplices, and many others, see for instance [?, ?, ?]. It turns out that the Wiener-Itô chaos expansion of a U-statistic is finite and thus Malliavin calculus is a particularly suitable method. Variance estimates, the approximation of the covariance structure and limit theorems which have been out of reach for many years can be derived. In this chapter we state the fundamental properties of U-statistics and investigate moment formulae. The main object of the chapter is to introduce the available limit theorems.


1 U-statistics and decompositions 
1.1 Definition 

Let $\mathbb{X}$ be a Polish space, $k \geq 1$, and $f : \mathbb{X}^k \to \mathbb{R}$ be a measurable function. The U-statistic of order $k$ with kernel $f$ over a configuration $\eta \in \mathbf{N}(\mathbb{X})$ is 0 if $\eta$ has strictly less than $k$ points and the formal sum

\[
U(f;\eta) = \sum_{\mathbf{x}_k \in \eta^k_{\neq}} f(\mathbf{x}_k)
\]

otherwise, where $\eta^k_{\neq}$ is the class of $k$-tuples $\mathbf{x}_k = (x_1,\dots,x_k)$ of distinct points from $\eta$. Remark that since the sum is over all such $k$-tuples, $f$ can be assumed to be symmetric without loss of generality.
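The definition can be illustrated on a finite configuration. The following is a minimal sketch, assuming the configuration is given as a Python list of pairwise distinct points; the function name `u_statistic` and the example kernel are illustrative choices, not notation from the chapter.

```python
from itertools import permutations

def u_statistic(kernel, points, k):
    """Sum `kernel` over all ordered k-tuples of distinct points.

    Returns 0 when the configuration has fewer than k points,
    following the convention of the definition.
    """
    if len(points) < k:
        return 0.0
    return float(sum(kernel(*tup) for tup in permutations(points, k)))

# Hypothetical order-2 kernel: indicator that two points of the real line
# are at distance at most 0.5 from each other.
def close(x, y):
    return 1.0 if abs(x - y) <= 0.5 else 0.0

eta = [0.0, 0.3, 1.0, 1.4]
print(u_statistic(close, eta, 2))
```

Since the sum runs over ordered tuples, each unordered pair is counted twice, which is why a symmetric kernel loses no generality.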

An abundant literature deals with the asymptotic study of $U(f;\tilde{\eta}_p)$ as $p \to \infty$ when $\tilde{\eta}_p$ is a binomial process, i.e. a set of $p$ iid variables over $\mathbb{X}$. We are concerned here with Poisson input, i.e. $\eta$ is a Poisson measure over $\mathbb{X}$ whose intensity is a non-atomic locally finite measure $\mu$ on $\mathbb{X}$. For the definition to make sense, the basic assumption is that $f \in L^1_s(\mathbb{X}^k) = L^1_s(\mathbb{X}^k; \mu^k)$.

In the sequel of this section, let $\mu$ be a non-atomic locally finite measure on $(\mathbb{X}, \mathcal{X})$, $\eta$ a Poisson measure with intensity $\mu$, and $k \geq 1$.


1.2 Chaotic decomposition and multiple integrals

Theorem 1. Let $f \in L^1_s(\mathbb{X}^k)$ be such that $U(f;\eta) \in L^2(P)$. We have the $L^2$ decomposition

\[
U(f;\eta) = \sum_{n=0}^{k} I_n(h_n). \tag{1}
\]

Here $I_n$ is the $n$-th order stochastic integral over $\eta$ defined in Chapter [1]. The functions $h_n$ have been explicitly computed in [?, Lemma 3.3],

\[
h_n(\mathbf{x}_n) = \binom{k}{n} \int_{\mathbb{X}^{k-n}} f(\mathbf{x}_n, \mathbf{x}_{k-n}) \, d\mu^{k-n}(\mathbf{x}_{k-n}) \tag{2}
\]

for $\mathbf{x}_n \in \mathbb{X}^n$, and $h_n$ is a function of $L^1_s(\mathbb{X}^n) \cap L^2_s(\mathbb{X}^n)$.

Remark 1. Somewhat counterintuitively, $f \in L^1_s(\mathbb{X}^k) \cap L^2_s(\mathbb{X}^k)$ does not imply that $E\,U(f;\eta)^2 < \infty$ (see [?]), but in most examples $f$ is bounded and has a bounded support, which makes the latter condition automatically satisfied.

As is apparent in Theorem 1, each U-statistic of order $k$ is a finite sum of multiple integrals of order $n \leq k$, and it is not difficult to prove that conversely any multiple integral of order $n \geq 1$ can be written as a finite sum of U-statistics whose orders are smaller than or equal to $n$. From a formal point of view, it is therefore equivalent to study the asymptotics of finite sums of U-statistics or of finite sums of multiple integrals. U-statistics are more likely to appear in applications, but the homogeneity of multiple integrals makes them easier to deal with, and some of the Malliavin operators of U-statistics have a particularly intuitive form. Consider for instance the case where $F = I_k(f)$ is a multiple integral of order $k$. The Malliavin derivative, the Ornstein-Uhlenbeck, and inverse Ornstein-Uhlenbeck operators take the following form

\[
D_x F = k I_{k-1}(f(x,\cdot)),\ x \in \mathbb{X}, \qquad LF = -k I_k(f), \qquad L^{-1}F = -k^{-1} I_k(f). \tag{3}
\]

For a U-statistic $F$, one can still derive $D_xF$, $LF$, $L^{-1}F$ using the linearity of those operators and the decomposition (1).

The object of this section is really the study of sums of multiple integrals whose order is bounded by some $k \geq 1$. The chaotic decomposition also yields that any $L^2$ variable can be approximated by such a sum, allowing us in some cases to pass on limit theorems stated here to infinite sums. The following result gives the first two moments of U-statistics.

Proposition 1. Let $f \in L^1_s(\mathbb{X}^k)$. Then $E|U(f;\eta)| < \infty$ and

\[
E\,U(f;\eta) = \int_{\mathbb{X}^k} f(\mathbf{x}_k) \, d\mu^k(\mathbf{x}_k).
\]

If furthermore $U(f;\eta) \in L^2(P)$,

\[
\mathrm{Var}(U(f;\eta)) = \sum_{n=1}^{k} n! \, \|h_n\|^2 \tag{4}
\]

where $h_n$ is given in Theorem 1 and $\|h_n\|$ is the usual $L^2(\mathbb{X}^n)$-norm.

Proof The first statement is a direct consequence of the Slivnyak-Mecke formula, while the second stems from the orthogonality between the multiple integrals $I_n(h_n)$, $0 \leq n \leq k$.
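The two moment formulae can be checked numerically on a toy case. The sketch below assumes $\mathbb{X} = [0,1]$, $\mu = t$ times Lebesgue measure, $k = 2$ and the kernel $f(x,y) = \mathbf{1}(|x-y| \leq r)$ with $r \leq 1/2$; these choices, the closed-form integrals in the comments, and all function names are illustrative assumptions rather than material from the chapter.

```python
import random

def poisson_points(t, rng):
    """Homogeneous Poisson process of intensity t on [0, 1] (exponential gaps)."""
    pts, x = [], rng.expovariate(t)
    while x <= 1.0:
        pts.append(x)
        x += rng.expovariate(t)
    return pts

def u_stat_pairs(pts, r):
    """U(f; eta) for f(x, y) = 1(|x - y| <= r): ordered pairs of distinct points."""
    n = len(pts)
    unordered = sum(1 for i in range(n) for j in range(i + 1, n)
                    if abs(pts[i] - pts[j]) <= r)
    return 2 * unordered

def exact_mean(t, r):
    # E U = integral of f with respect to mu^2 = t^2 * (2r - r^2)
    return t**2 * (2 * r - r**2)

def exact_variance(t, r):
    # h_1(x) = 2 t |[x - r, x + r] ∩ [0, 1]| and h_2 = f (valid for r <= 1/2)
    norm_h1_sq = 4 * t**3 * (4 * r**2 - 10 * r**3 / 3)
    norm_h2_sq = t**2 * (2 * r - r**2)
    return norm_h1_sq + 2 * norm_h2_sq  # sum over n of n! * ||h_n||^2

def simulate(t, r, n_samples, seed=0):
    """Monte Carlo estimates of the mean and variance of U(f; eta)."""
    rng = random.Random(seed)
    vals = [u_stat_pairs(poisson_points(t, rng), r) for _ in range(n_samples)]
    mean = sum(vals) / n_samples
    var = sum((v - mean) ** 2 for v in vals) / (n_samples - 1)
    return mean, var

if __name__ == "__main__":
    print(exact_mean(10, 0.1), exact_variance(10, 0.1))
    print(simulate(10, 0.1, 2000))
```

The simulated values should fluctuate around the exact ones; the agreement is only statistical, so no particular output is claimed here.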


1.3 Hoeffding decomposition 

Let $p \geq 1$ and let $\tilde{\eta}_p = \{X_1, \dots, X_p\}$ be a family of i.i.d. variables with common distribution $\mu$ on $\mathbb{X}$. Given a measurable kernel $h$ over $\mathbb{X}^k$, $k \geq 1$, the traditional Hoeffding decomposition (see e.g. Vitale [?]) is written

\[
U(h;\tilde{\eta}_p) = k! \binom{p}{k} \sigma_p^k(h) = k! \binom{p}{k} \sum_{m=0}^{k} \binom{k}{m} \sigma_p^m(H_m),
\]

where

\[
\sigma_p^m(H_m) = \binom{p}{m}^{-1} \sum_{1 \leq i_1 < i_2 < \dots < i_m \leq p} H_m(X_{i_1},\dots,X_{i_m}), \qquad 0 \leq m \leq k,
\]

and each kernel $H_m$ is symmetric and completely degenerate, i.e.

\[
\int_{\mathbb{X}} H_m(x_1,\dots,x_{m-1},y) \, d\mu(y) = 0
\]

for $\mu^{m-1}$-a.e. $x_1,\dots,x_{m-1}$. This property implies in particular the orthogonality of the $\sigma_p^m(H_m)$, $1 \leq m \leq k$. If $\mu$ is a probability measure, the $H_m$ are uniquely defined and can be expressed explicitly via an inclusion-exclusion formula



(5) 


where $h_n$ is defined in (2). As is clear in this last formula, this decomposition is different from (1) because in the latter multiple integration is performed with respect to the compensated measure $\eta - \mu$, while in $\sigma_p^m(H_m)$ the compensation occurs in the kernel $H_m$.

The Hoeffding rank $m_1$ is defined as the smallest index $m$ such that $\|H_m\| \neq 0$, and we can see through (5) that it is equal to the smallest index $n$ such that $\|h_n\| \neq 0$. We furthermore have $H_{m_1} = \binom{k}{m_1}^{-1} h_{m_1}$. As proved in [?] for binomial processes or [?] for Poisson processes, the stochastic integral of order $m_1$ dominates the sum, and limit theorems for geometric U-statistics can then be derived by studying this term, see Section 2.1.2.


1.4 Contraction operators 

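Let $f \in L^2_s(\mathbb{X}^k)$, $g \in L^2_s(\mathbb{X}^q)$. If $f$ and $g$ satisfy the technical conditions defined in Chapter [?], one can define for $1 \leq l \leq r \leq \min(q,k)$ their contraction function of index $(r,l)$, denoted $f \star_r^l g$. It has $k + q - r - l$ variables as arguments, decomposed in $(\mathbf{x}_{r-l}, \mathbf{y}_{q-r}, \mathbf{z}_{k-r})$, where $\mathbf{x}_{r-l} \in \mathbb{X}^{r-l}$, $\mathbf{y}_{q-r} \in \mathbb{X}^{q-r}$ and $\mathbf{z}_{k-r} \in \mathbb{X}^{k-r}$. We have

\[
f \star_r^l g(\mathbf{x}_{r-l}, \mathbf{y}_{q-r}, \mathbf{z}_{k-r}) = \int_{\mathbb{X}^l} f(\mathbf{u}_l, \mathbf{x}_{r-l}, \mathbf{z}_{k-r}) \, g(\mathbf{u}_l, \mathbf{x}_{r-l}, \mathbf{y}_{q-r}) \, d\mu^l(\mathbf{u}_l).
\]

Remember that each function appearing here is symmetric, whence the order of the arguments does not matter. Contraction operators are used below to assess the distance between a stochastic integral and the normal law. See [?] for more information on contraction operators.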
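To make the index bookkeeping of a contraction concrete, here is a minimal sketch on a toy measure space. It assumes a finite set of atoms with weights standing in for $\mu$ (the chapter itself works with non-atomic measures, so this is only a discretized illustration), and the function name `contraction` and its argument layout are hypothetical.

```python
from itertools import product
from math import prod

def contraction(f, k, g, q, r, l, atoms, weights):
    """Contraction of index (r, l) of f (k arguments) and g (q arguments).

    The result takes three tuples: x (r - l shared arguments),
    y (q - r arguments seen only by g) and z (k - r arguments seen only by f).
    The l integrated arguments are summed over the weighted atoms.
    """
    if not (1 <= l <= r <= min(k, q)):
        raise ValueError("need 1 <= l <= r <= min(k, q)")

    def h(x, y, z):
        assert len(x) == r - l and len(y) == q - r and len(z) == k - r
        total = 0.0
        for idx in product(range(len(atoms)), repeat=l):
            u = tuple(atoms[i] for i in idx)
            w = prod(weights[i] for i in idx)
            total += w * f(*u, *x, *z) * g(*u, *x, *y)
        return total

    return h

# Toy symmetric kernel on the atoms {0, 1, 2}, each of mass 1.
f2 = lambda a, b: a * b
atoms, weights = [0, 1, 2], [1.0, 1.0, 1.0]
c11 = contraction(f2, 2, f2, 2, 1, 1, atoms, weights)  # two free arguments
c21 = contraction(f2, 2, f2, 2, 2, 1, atoms, weights)  # one free argument
print(c11((), (1,), (2,)), c21((2,), (), ()))
```

With $k = q = 2$, the index $(1,1)$ integrates out one shared variable and leaves two free ones, while $(2,1)$ integrates one variable and identifies the other, leaving a single free argument.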


2 Rates of convergence 


Let $F$ be an $L^2$ variable of the form

\[
F = \sum_{n=0}^{k} I_n(h_n) \tag{6}
\]

for some kernels $h_n \in L^2_s(\mathbb{X}^n)$, $n \geq 1$. We assume that those kernels satisfy the technical conditions mentioned in Chapter [?] so that their mutual contraction kernels are well defined. This model encompasses U-statistics, as outlined by Theorem 1, as well as finite sums of U-statistics and multiple integrals.

In applied situations, the set-up consists of a fixed integer $k \geq 1$, and, for $t > 0$, a family of measures $\mu_t$ on $\mathbb{X}$, and a family of kernels $h_{n,t} \in L^2_s(\mathbb{X}^n; \mu_t^n)$, $1 \leq n \leq k$. We study the random variables

\[
F_t = \sum_{n=1}^{k} I_n(h_{n,t}) \tag{7}
\]

and more precisely the existence of numbers $a_t$, $b_t > 0$ and of a random variable $V$ such that

\[
\tilde{F}_t := \frac{F_t - a_t}{\sqrt{b_t}} \to V
\]

in the weak topology. In all the applications, $\mu_t$ is either of the form

• $\mu_t = t\mu$ for some reference measure $\mu$ on the space $\mathbb{X}$, or

• $\mu_t = \mathbf{1}_{\mathbb{X}_t}\,\mu$ where $\mathbb{X}_t \subset \mathbb{X}$ depends on $t$.

The following two settings occur in the most important applications.

If $\eta = \eta_t$ is a Poisson point process on $\mathbb{X} = \mathbb{R}^d$, the measure $\mu$ will often be the Lebesgue measure $\ell_d$, or for $\mathbb{X} = \mathbb{R}^d \times M$ a product measure $\mu = \ell_d \otimes \nu$ with a probability measure $\nu$ on a topological mark space $M$.

If $\eta = \eta_t$ is a Poisson 'flat' process on the Grassmannian $\mathbb{X} = A_i^d$ of affine $i$-dimensional subspaces (flats) of $\mathbb{R}^d$, the intensity measure $\mu(\cdot)$ will be a translation invariant measure on $A_i^d$. The Poisson flat process is only observed in a compact convex window $W \subset \mathbb{R}^d$ with interior points. Thus, we can view $\eta_t$ as a Poisson process on the set $[W]$ defined by

\[
[W] = \{ h \in A_i^d : h \cap W \neq \emptyset \}.
\]


2.1 Central Limit theorem 

Let $F$ be of the form (6). Let $N \sim \mathcal{N}(0,1)$ and $\sigma^2 = \mathrm{Var}(F)$. The next result, whose Wasserstein bound has been established in [?] and whose Kolmogorov bound in [?], gives a bound on the distance between $F$ and $N$ in terms of the contractions between the kernels of $F$.


Theorem 2. Put

\[
B(F) = \max\Big( \max_1 \|h_n \star_r^l h_n\|,\ \max_2 \|h_n \star_r^l h_m\|,\ \max_{n=1,\dots,k} \|h_n\|^2 \Big),
\]

\[
B'(F) = \max\big( |1 - \sigma^2|,\ B(F),\ B(F)^{3/2} \big),
\]

where $\max_1$ is over $1 \leq r \leq n \leq k$, $1 \leq l \leq r \wedge (n-1)$, and $\max_2$ is over $1 \leq l \leq r \leq n < m \leq k$. There exists a constant $C_k > 0$ not depending on the kernels of $F$ such that

\[
d_W(F,N) \leq \sigma^{-1} C_k B(F), \tag{8}
\]

\[
d_K(F,N) \leq C_k B'(F). \tag{9}
\]

We reproduce here the important steps of the proof for the Wasserstein bound. The main result, due to Peccati, Solé, Taqqu, Utzet [?], is a general inequality on the Wasserstein distance between a Poisson functional $F$ with variance $\sigma^2 > 0$ having a finite Wiener-Itô expansion and the normal law. We have

\[
d_W(F,N) \leq \frac{1}{\sigma} \sqrt{ E\big[ (\sigma^2 - \langle DF, -DL^{-1}F \rangle_{L^2(\mathbb{X})})^2 \big] } + \frac{1}{\sigma^2} \int_{\mathbb{X}} E\big[ (D_xF)^2 \, |D_x L^{-1}F| \big] \, \mu(dx). \tag{10}
\]

To translate those inequalities into bounds on the contraction norms, we use the multiplication formula from [?], which yields that the product of multiple integrals is a linear combination of multiple integrals. For $k, q \geq 1$, $f \in L^2_s(\mathbb{X}^k)$, $g \in L^2_s(\mathbb{X}^q)$,

\[
I_k(f) I_q(g) = \sum_{r=0}^{k \wedge q} r! \binom{k}{r} \binom{q}{r} \sum_{l=0}^{r} \binom{r}{l} I_{k+q-r-l}(f \,\tilde{\star}_r^l\, g), \tag{11}
\]

where the symmetrized contraction kernels $f \,\tilde{\star}_r^l\, g$ are the average of the kernels $f \star_r^l g$ over all possible permutations of the variables.

If for instance $F = I_k(f)$ is a single multiple integral, (3) gives the value of the Malliavin operators, and a computation then yields the bound (8) with $h_k = f$, $h_i = 0$ for $i \neq k$, see [?, Prop. 5.5]. If $F$ is a general functional with a finite decomposition, such as a U-statistic (see (1)), Malliavin operators are computed using linearity and yield the bound (8), see the proof of Th. 3.5 in [?].

Concerning Kolmogorov distance, Schulte [?, ?] has derived a Stein bound similar to (10), but with more terms on the right-hand side (Theorem 1.1), reflecting the fact that the test functions are indicator functions, more irregular than the Lipschitz functions involved in the Wasserstein distance. This bound was later improved by Eichelsbacher and Thäle [?, Th. 3.1], reducing the number of additional terms. With similar computations as in the Wasserstein case, one can then prove [?, Th. 4.1] that those additional terms only add contraction norms $\|h_n \star_r^l h_m\|^{3/2}$ at the power 3/2, up to a constant, yielding the bound $B'(F)$.

Remark 2. The terms in $B'(F)$ bounding the Kolmogorov distance are smaller than the original terms present in $B(F)$ if the bound goes to 0, and do not change the magnitude of the bound or its eventual convergence to 0.

Remark 3. The constant $C_k$ explodes as $k \to \infty$. In other papers [?], [?], similar bounds are derived in more specific cases, with a different method. The constants are more tractable and make it possible, for instance, to approximate accurately the distance from a Gaussian to an infinite series of multiple integrals by that of its truncation at some order (see for instance [?]).

Theorem 3 (4th moment theorem). Assume furthermore that the kernels $h_n$ are non-negative. Then for some $C'_k > 0$

\[
B(F) \leq C'_k \sqrt{ E F^4 - 3\sigma^4 }.
\]


• In view of (8), the convergence of the 4-th moment to that of a Gaussian therefore implies a central limit theorem, with a bound for the Wasserstein distance. In this case, as noted in [?], using (9) yields a similar bound for the Kolmogorov distance. The positivity of the kernels is adapted to U-statistics with a non-negative kernel.

• It is highly remarkable that the convergence of the 4-th moment to that of the Gaussian variable is therefore sufficient for such variables to converge to the normal law. The only technical requirement is that the variables $F^4$ are uniformly integrable.

Example 1 (De Jong's theorem). Let $f_2$ be a non-zero degenerate symmetric kernel from $L^1_s(\mathbb{X}^2)$, i.e. such that

\[
\int_{\mathbb{X}} f_2(x,y) \, d\mu(y) = 0 \quad \text{for } \mu\text{-a.e. } x \in \mathbb{X}.
\]

This degeneracy property implies that $U(f_2;\eta) = I_2(f_2)$; we also assume that $f_2 \in L^4(\mathbb{X}^2)$. De Jong [?] derived a 4-th moment central limit theorem for binomial U-statistics of the form $U(f_2;\tilde{\eta}_p)$, where $p \in \mathbb{N}$ goes to infinity and $\tilde{\eta}_p$ is a sequence of $p$ iid variables with law $\mu$. In the Poisson framework, (9) yields Berry-Esseen bounds between $F = U(f_2;\eta) = I_2(f_2)$ and $N$:



d K (F,N) <C 2 ||^|pmax((7(/2),(7(/2) 3/2 ) 


max' 


where 


b(fa) = max (||/ 2 *0 fi II, ||/2*!/2||, IL/ 2 * 2 / 2 ll) ■ 


See Eichelsbacher and Thäle [?, Th. 4.5] for details. In [?], Peccati and Thäle derive bounds on the Wasserstein distance between such a U-statistic and a target Gamma variable, also in terms of contraction operators.

2.1.1 Local marked U-statistics

For many applications, it is useful to assume that the state space $\mathbb{X}$ is of the form $S \times M$ where $S$ is a subset of $\mathbb{R}^d$ containing the points $t_i$ of $\eta$, and $(M, \mathcal{M})$ is a mark space, i.e. a locally compact space endowed with some probability measure $\nu$. The space $M$ contains marks $m_i$ that will be randomly assigned to each point of the process. In this setting, assume that $\eta_t$ has intensity measure $\mu_t = \mathbf{1}_{\mathbb{X}_t}\,\ell_d \otimes \nu$, where $\mathbb{X}_t = [-t^{1/d}, t^{1/d}]^d \times M$, and let $F_t \in L^2(P)$ be a U-statistic $F_t = U(f;\eta_t)$. Let the kernel $f$ of the U-statistic be locally square integrable and stationary, i.e. for $\mu^k$-almost all $(\mathbf{t}_k, \mathbf{m}_k) \in \mathbb{X}^k$ and all $z \in \mathbb{R}^d$,

\[
f(\mathbf{t}_k + z, \mathbf{m}_k) = f(\mathbf{t}_k, \mathbf{m}_k). \tag{12}
\]

The tail behavior of the function $f$ is fundamental regarding the limit of the variables $F_t$ as $t \to \infty$.

Definition 1. A measurable function $f : (\mathbb{R}^d \times M)^k \to \mathbb{R}$ is rapidly decreasing if it is locally square integrable, stationary, and if it satisfies the following integrability condition: there exists a non-vanishing probability density $\kappa$ on $(\mathbb{R}^d)^{k-1}$ such that for $p = 2, 4$,



The slight abuse of notation $f(0, \mathbf{t}_{k-1})$ means that $\mathbf{t}_k = (0, \mathbf{t}_{k-1}) = (0, t_1, \dots, t_{k-1})$ and $\mathbf{m}_k = (m_1, \dots, m_k)$.

We have in this case the following result, which is a consequence of Theorem 6.2 
and Example 2.12-(ii) in [?] : 

Theorem 4. Let $F_t = U(f;\eta_t)$ where $f$ is a rapidly decreasing function, and $\mu_t = \mathbf{1}_{\mathbb{X}_t}\,\ell_d \otimes \nu$ with $\mathbb{X}_t = [-t^{1/d}, t^{1/d})^d \times M$. Then, with $a_t = E F_t$, $b_t = \mathrm{Var}(F_t)$, we have for some $C_1, C_2, C_3 > 0$ not depending on $t$,

\[
C_1 t \leq b_t \leq C_2 t,
\]

\[
d_W(\tilde{F}_t, N) \leq C_3 t^{-1/2}.
\]

Remark 4. Reitzner & Schulte [?] first established this result in the case where $f$ is the indicator function of a ball of $\mathbb{R}^d$ (any non-vanishing continuous density $\kappa$ can be chosen in this case because $f(0,\cdot)$ has a compact support).

Remark 5. A similar result holds if $F$ is simply assumed to be a finite sum of stochastic integrals whose kernels are rapidly decreasing functions, the U-statistics being a particular case.

2.1.2 Geometric U-statistics

Coming back to the general framework, assume $F_t = U(f;\eta_t)$ where $f \in L^2_s(\mathbb{X}^k)$ is fixed and $\mu_t = t\mu$ for some measure $\mu$ on $\mathbb{X}$. Then $F_t$ admits the decomposition (7) where

\[
h_{n,t}(\mathbf{x}_n) = t^{k-n} \binom{k}{n} \int_{\mathbb{X}^{k-n}} f(\mathbf{x}_n, \mathbf{x}_{k-n}) \, d\mu^{k-n}(\mathbf{x}_{k-n}) = t^{k-n} h_n(\mathbf{x}_n).
\]

One can then see that the term $\|h_{1,t}\|$ dominates the other terms in the variance expression (4), provided this term does not vanish. In any case the important feature is the Hoeffding rank of the U-statistic

\[
n_1 := \inf\{ n : \|h_n\| \neq 0 \},
\]

because it turns out that $I_{n_1}(h_{n_1,t})$ is the predominant term in (7), in the sense that $F_t - I_{n_1}(h_{n_1,t}) = o(F_t)$ for the $L^2$ norm as $t \to \infty$. It yields the following result (Theorem 7.3 in [?]).

Theorem 5. For some $C_1, C_2, C_3 > 0$ not depending on $t$,

\[
C_1 t^{2k-n_1} \leq b_t \leq C_2 t^{2k-n_1}.
\]

(i) If $n_1 = 1$, $U(f;\eta_t)$ follows a central limit theorem and

\[
d_W(\tilde{F}_t, N) \leq C_3 t^{-1/2},
\]

\[
d_K(\tilde{F}_t, N) \leq C_3 t^{-1/2}.
\]

(ii) If $n_1 > 1$, $U(f;\eta_t)$ does not follow a CLT and $\tilde{F}_t$ converges to a Gaussian chaos of order $n_1$ (see [1, Theorem 7.3-2]).

For a deeper understanding we refer to the proof of Theorem 6.

Remark 6. Point (i) first appears in [?].

Remark 7. Point (ii) crucially uses the results of Dynkin & Mandelbaum [?].

Remark 8. The speed of convergence to the Gaussian chaos in (ii) is studied by Peccati and Thäle [?] in case the limit is a Gamma distributed random variable.

2.1.3 Regimes classification 

The crucial difference in Theorems 4 and 5 is the area of influence of a given point $x \in \eta_t$. In the case of a local U-statistic, a typical point $x \in \eta_t$ interacts with a stochastically bounded number of neighbors, which are more likely to be nearby in view of Assumption (12). The situation is different for a geometric U-statistic, where a point potentially interacts with any other point, regardless of the distance. Both these regimes can be seen as two particular cases of a continuum.

Let $\alpha_t > 0$ be a scaling factor, $\mathbb{X}_t = [-t^{1/d}, t^{1/d}]^d \times M$, $\mu_t = \mathbf{1}_{\mathbb{X}_t}\,\ell_d \otimes \nu$, and $F_t = U(f_t;\eta_t)$, where $f_t$ is obtained by rescaling a rapidly decreasing function $f$:

\[
f_t(\mathbf{x}_k) = f(\alpha_t \mathbf{x}_k), \qquad \mathbf{x}_k \in \mathbb{X}^k. \tag{13}
\]

Say that $f$ has non-degenerate projections if the functions

\[
f_n(\mathbf{x}_n) = \int_{\mathbb{X}^{k-n}} f(\mathbf{x}_n, \mathbf{x}_{k-n}) \, d\mu^{k-n}(\mathbf{x}_{k-n}), \qquad \mathbf{x}_n \in \mathbb{X}^n,
\]

well defined in virtue of (12), are not $\mu$-a.e. equal to 0. It is trivially the case if for instance $f \neq 0$ and $f \geq 0$ $\mu$-a.e. Concerning notations, every spatial transformation of a point $x = (t,m) \in \mathbb{R}^d \times M$, such as translation, rotation, or multiplication by a scalar number, is only applied to the spatial component $t$.

Subsequently, any spatial transformation applied to a $k$-tuple of points $\mathbf{x}_k = (x_1,\dots,x_k)$ is applied to the spatial components of the $x_i$'s. The quantity $v_t = \alpha_t^{-d}$ is relevant because it gives the magnitude of the number of points interacting with a typical point $x$. The case $v_t = \alpha_t = 1$ is that of local U-statistics. If $v_t = t$ is roughly the volume of $\mathbb{X}_t$, it corresponds to geometric U-statistics. In this case it is useless to assume that $f$ is rapidly decreasing, as only the behavior over $\mathbb{X}_1$ is relevant for the problem.

Theorem 6. Assume that $f_t$ is of the form (13), where $f$ is a rapidly decreasing function with non-degenerate projections. With the notations above, there are $C_1, C_2, C_3 > 0$ such that

\[
C_1 \leq \frac{b_t}{t \, v_t^{2k-2} \max(1, v_t^{-k+1})} \leq C_2,
\]

and

\[
d_W(\tilde{F}_t, N) \leq C_3 t^{-1/2} \max(1, v_t^{-k+1})^{1/2},
\]

\[
d_K(\tilde{F}_t, N) \leq C_3 t^{-1/2} \max(1, v_t^{-k+1})^{1/2}.
\]

Concerning the bound for Kolmogorov distance, it is not formally present in the literature. It relies on the fact that in Theorem 2, $B'(F) \leq C B(F)$ for some $C > 0$ in the case where $\sigma \to 1$ and $B(F) \to 0$. Then one can simply reproduce the proof of [?], entirely based on an upper bound for $B(F)$.

Remark 9. Theorems 4 and 5(i) can be retrieved from this theorem by setting respectively $v_t = 1$ or $v_t = t$.

Remark 10. If some projections do vanish, the convergence rate can be modified, and the limit might not even be Gaussian, as is the case for the degenerate geometric U-statistics of Th. 5(ii).

Remark 11. Depending on the asymptotic behavior of $v_t$, we can identify four different regimes:

1. Long interactions: $v_t \to \infty$, CLT at speed $t^{-1/2}$, the first chaos $I_1(h_{1,t})$ dominates (geometric U-statistics).

2. Constant size interactions: $v_t = 1$, CLT at speed $t^{-1/2}$, all chaoses have the same order of magnitude (local U-statistics).

3. Small interactions: $v_t \to 0$, $t v_t^{k-1} \to \infty$, CLT at speed $(t v_t^{k-1})^{-1/2}$, higher order chaoses dominate. In the case of random graphs ($k = 2$), the corresponding bound in $(t v_t)^{-1/2}$ has been obtained in [?].

4. Rare interactions: $t v_t^{k-1} \to c < \infty$, the bound does not converge to 0. In the case $k = 2$, it has been shown in [?] that there is no CLT but a Poisson limit in the case $c > 0$ (see Chapter [?] for more on Poisson limits).
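The way the variance order and the rate of Theorem 6 interpolate between these regimes can be tabulated directly. The following is a small sketch that only evaluates the two expressions of the theorem up to the unspecified constants; the function names are illustrative.

```python
def variance_order(t, v, k):
    """Order of magnitude of b_t: t * v^(2k-2) * max(1, v^(-k+1))."""
    return t * v ** (2 * k - 2) * max(1.0, v ** (-(k - 1)))

def clt_rate(t, v, k):
    """Rate in the distance bounds: t^(-1/2) * max(1, v^(-k+1))^(1/2)."""
    return t ** -0.5 * max(1.0, v ** (-(k - 1))) ** 0.5

if __name__ == "__main__":
    t, k = 1.0e4, 2
    for label, v in [("long", t), ("constant", 1.0), ("small", 1.0e-2), ("rare", 1.0e-4)]:
        print(label, variance_order(t, v, k), clt_rate(t, v, k))
```

For $v_t \leq 1$ the rate equals $(t v_t^{k-1})^{-1/2}$, which makes visible why the bound stops converging to 0 once $t v_t^{k-1}$ stays bounded.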


2.2 Other limits and multi-dimensional convergence 

Besides the Gaussian chaoses appearing in Theorem 5-(ii), some characterizations of non-central limits have also been derived for Poisson U-statistics.


2.2.1 Multidimensional convergence 

We consider in this section the joint behavior of random variables $F_t = (F_{1,t},\dots,F_{k,t})$ where $F_{m,t} = I_{q_m}(h_{m,t})$ for $1 \leq m \leq k$, with $h_{m,t} \in L^2_s(\mathbb{X}^{q_m})$ for some $q_m \geq 1$, for $t > 0$, $1 \leq m \leq k$.

Call $\sigma_t^2 = \sum_{m=1}^{k} \mathrm{Var}(F_{m,t})$. Any $L^2$ candidate for the limit of $\sigma_t^{-1} F_t$ should have as covariance matrix

\[
C_{m,n} = \lim_{t} \sigma_t^{-2} \, E[F_{m,t} F_{n,t}], \qquad 1 \leq m, n \leq k,
\]

if those limits exist. In this case there is indeed asymptotic normality if all contraction norms

\[
\| h_{m,t} \star_r^l h_{m,t} \|
\]

go to 0 for $r = 1,\dots,q_m$ and every $l = 1,\dots,r \wedge (q_m - 1)$, under technical conditions on the kernels related to the technical condition of Chapter [?], see [?, Th. 5.8], [?, Th. 2.4] for details. These articles contain explicit bounds on the speed of convergence with a specific distance related to thrice differentiable functions on $(\mathbb{R}^d)^k$, and the convergence is stable, in the sense of [?].

If now $F_t = (F_{1,t},\dots,F_{k,t})$ where each $F_{m,t}$ is a U-statistic, one can consider the random vector $G_t$ constituted by all multiple integrals with respect to kernels from the decompositions of the $F_{m,t}$, as defined in (1). One can then infer conditions for asymptotic normality of $F_t$ by applying the previous considerations to $G_t$.

As noted in Remark 11, some U-statistics behave asymptotically as Poisson variables. Asymptotic joint laws of U-statistics can also converge to random vectors with marginal Poisson laws, and it can also happen that they converge to a hybrid random vector which has both Gaussian and Poisson marginals; here again the reader is referred to Chapter [?].


2.2.2 Gamma 

Similar results to those of Section 2.1 with Gamma limits have been derived by Peccati and Thäle [?] for Poisson chaoses of even order. The distance used there is

\[
d_3(U,V) = \sup_{h \in \mathcal{H}_3} | E\,h(U) - E\,h(V) |
\]

where $\mathcal{H}_3$ is the class of functions of class $C^3$ with all first 3 derivatives uniformly bounded by 1. We again denote by $f \,\tilde{\star}_r^l\, g$ the symmetrized contraction kernels.

For $\nu > 0$, let $F(\nu/2)$ be a Gamma distributed variable with mean and variance both equal to $\nu/2$. We introduce the centered variable $G(\nu) := 2F(\nu/2) - \nu$, which has variance $2\nu$.

Theorem 7. Let $F = I_k(h_k)$ for some even integer $k \geq 2$. We have

\[
d_3(I_k(h_k), G(\nu)) \leq D_k \max\Big\{ \big| k! \|h_k\|^2 - 2\nu \big| ;\ \|h_k \star_p^p h_k\| ;\ \|h_k \star_r^l h_k\|^{1/2} ;\ \|h_k \,\tilde{\star}_{k/2}^{k/2}\, h_k - c_k h_k\| \Big\}
\]

where the maximum is taken over all $p = 1,\dots,k-1$ such that $p \neq k/2$ and all $(r,l)$ such that $r \neq l$ and $l = 0$, or $r \in \{1,\dots,k\}$ and $l \in \{1,\dots,\min(r,k-1)\}$. Also

\[
c_k = \frac{4}{(k/2)! \binom{k}{k/2}^2}.
\]

Remark 12. In the case of double integrals ($k = 2$), the authors of [?] provide a 4th moment theorem, in the sense that under some technical conditions, a sequence of double stochastic integrals converges to a Gamma variable if their first moments converge to those of a Gamma variable.

Remark 13. This result makes it possible to give an upper bound on the speed of convergence to the second Gaussian chaos in Theorem 5 in the case $n_1 = 2$, if this limit is indeed a Gamma variable.


3 Large deviations 

There are only a few investigations concerning concentration inequalities for Poisson U-statistics. Most results require a good bound of the form $\sup_{\eta \in \mathbf{N}(\mathbb{X}),\, z \in \mathbb{X}} D_z F < \infty$. For U-statistics of order $\geq 2$ this condition is not satisfied, even if $f$ is bounded. For U-statistics of order 1, this holds if $\|f\|_\infty < \infty$. Therefore we split our investigations into a section on U-statistics of order one and one on higher order local U-statistics. We start with a general result. Throughout this section we assume that $f \geq 0$ and $f \neq 0$.


3.1 A general LDI 

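In this section we sketch an approach developed in [?] leading to a general concentration inequality. For two counting measures $\eta$ and $\nu$ we define the difference $\eta \setminus \nu$ by

\[
\eta \setminus \nu = \sum_{x \in \mathbb{X}} (\eta(x) - \nu(x))_+ \, \delta_x. \tag{14}
\]

For $x \in \eta$ and $f \in L^1_s(\mathbb{X}^k)$, we recall that

\[
U(f;\eta) = \sum_{x \in \eta} F(x;\eta) \quad \text{with} \quad F(x;\eta) = \sum_{\mathbf{x}_{k-1} \in (\eta \setminus \{x\})^{k-1}_{\neq}} f(x, \mathbf{x}_{k-1}).
\]

Assume that in addition to $\eta$ a second point set $\zeta \in \mathbf{N}(\mathbb{X})$ is chosen. The non-negativity of $f$ yields

\[
U(f;\eta) \leq U(f;\zeta) + k \sum_{x \in \eta} F(x;\eta) \, \mathbf{1}(x \notin \zeta) = U(f;\zeta) + k \int F(x;\eta) \, d(\eta \setminus \zeta).
\]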
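The comparison inequality between two configurations can be checked on small examples. The sketch below assumes finite configurations of distinct points given as Python lists and a symmetric non-negative kernel; `local_sum` plays the role of $F(x;\eta)$, and all names are illustrative.

```python
from itertools import permutations

def u_statistic(f, pts, k):
    """Sum of f over ordered k-tuples of distinct points of pts."""
    if len(pts) < k:
        return 0.0
    return float(sum(f(*tup) for tup in permutations(pts, k)))

def local_sum(f, x, pts, k):
    """F(x; eta): sum of f(x, x_2, ..., x_k) over ordered (k-1)-tuples
    of distinct points of eta other than x."""
    others = [p for p in pts if p != x]
    return float(sum(f(x, *tup) for tup in permutations(others, k - 1)))

def upper_bound(f, eta, zeta, k):
    """Right-hand side U(f; zeta) + k * sum of F(x; eta) over x in eta, x not in zeta."""
    zeta_set = set(zeta)
    extra = sum(local_sum(f, x, eta, k) for x in eta if x not in zeta_set)
    return u_statistic(f, zeta, k) + k * extra

close = lambda x, y: 1.0 if abs(x - y) <= 0.5 else 0.0
eta = [0.0, 0.3, 1.0, 1.4]
zeta = [0.0, 0.3, 2.0]
print(u_statistic(close, eta, 2), upper_bound(close, eta, zeta, 2))
```

A tuple of points of $\eta$ either lies entirely in $\zeta$ or has at least one coordinate outside it; the factor $k$ accounts for the position of that coordinate, which is where the symmetry of the kernel is used.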


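The convex distance of a finite point set $\eta \in \mathbf{N}(\mathbb{X})$ to some $A \subset \mathbf{N}(\mathbb{X})$ was introduced in [?], and is given by

\[
d_T(\eta, A) = \max_{\|u\|_{2,\eta} \leq 1} \min_{\zeta \in A} \int u \, d(\eta \setminus \zeta)
\]

where $u : \mathbb{X} \to \mathbb{R}_+$ is a non-negative measurable function and $\|u\|_{2,\eta}^2 = \int u^2 \, d\eta$. To link the convex distance to the U-statistic, we insert for $u$ the normalized function $\|F(x;\eta)\|_{2,\eta}^{-1} F(x;\eta)$ and rewrite $U(f;\eta)$ in terms of the convex distance as follows:

\[
d_T(\eta, A) \geq \min_{\zeta \in A} \int \frac{F(x;\eta)}{\|F(x;\eta)\|_{2,\eta}} \, d(\eta \setminus \zeta) \geq \frac{1}{k \, \|F(x;\eta)\|_{2,\eta}} \min_{\zeta \in A} \big( U(f;\eta) - U(f;\zeta) \big).
\]

If we assume $F(x;\eta) \leq B$, then $\|F(x;\eta)\|_{2,\eta}^2 \leq B \sum_{x \in \eta} F(x;\eta) = B\,U(f;\eta)$, which implies

\[
d_T(\eta, A) \geq \frac{1}{k\sqrt{B}} \min_{\zeta \in A} \frac{U(f;\eta) - U(f;\zeta)}{\sqrt{U(f;\eta)}} \, \mathbf{1}\big( \forall x \in \eta : F(x;\eta) \leq B \big). \tag{15}
\]

In [?], a LDI for the convex distance was proved. For $\eta$ a Poisson point process, and for $A \subset \mathbf{N}(\mathbb{X})$, we have

\[
P(A) \, P\big( d_T(\eta, A) \geq s \big) \leq \exp\Big( -\frac{s^2}{4} \Big).
\]

Precisely as in [?], this concentration inequality combined with the estimate (15) yields the following theorem.

Theorem 8. Assume that $\varepsilon(\cdot)$ and $B \in \mathbb{R}$ satisfy $P(\exists x \in \eta : F(x;\eta) > B) \leq \varepsilon(B)$. Let $m$ be the median of $U(f;\eta)$. Then

\[
P\big( |U(f;\eta) - m| \geq u \big) \leq 4 \exp\Big( -\frac{u^2}{4 k^2 B (u + m)} \Big) + 3\,\varepsilon(B). \tag{16}
\]

In the next sections we apply this to U-statistics of order one and to local U-statistics. In the applications, the crucial ingredient is a good estimate for $\varepsilon(B)$.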


3.2 LDI for first order U-statistics 

There are several concentration inequalities for integrals over Poisson point processes, i.e. U-statistics of order one,

\[
U = U(f;\eta) = \sum_{x \in \eta} f(x) = \int f \, d\eta, \qquad f \geq 0,
\]

in which case $D_z U = f(z)$. Assuming that $\|f\|_\infty = B < \infty$ we have

\[
\|D_z U\|_\infty \leq B.
\]

A result by Houdré and Privault [?] shows that

\[
P\big( U - \|f\|_1 \geq u \big) \leq \exp\Big( -\frac{\|f\|_1}{\|f\|_\infty} \, g\Big( \frac{u}{\|f\|_1} \Big) \Big) \tag{17}
\]

where $g(u) = (1+u)\ln(1+u) - u$, $u \geq 0$, and because $f \geq 0$ the 1-norm equals the expectation $E\,U(f;\eta)$. A similar result is due to Ané and Ledoux [?]. Reynaud-Bouret [?] proves an estimate involving the 2-norm $\|f\|_2$ instead of the 1-norm. A slightly more general estimate is given by Breton et al. [?].

We could also make use of Theorem 8 and choose $B = \|f\|_\infty$. This yields

\[
P\big( |U(f;\eta) - m| \geq u \big) \leq 4 \exp\Big( -\frac{u^2}{4 \|f\|_\infty (u + m)} \Big), \tag{18}
\]

which is a slightly weaker estimate than (17).

In this paragraph we assume that $\mathbb{X}$ is equipped with a distance and $B(x,r)$ denotes the ball of radius $r$ around $x \in \mathbb{X}$. If $U$ is a local U-statistic which is concentrated on a ball of radius $\delta_t$, we have

\[
F(x;\eta) \leq \|f\|_\infty \, \eta(B(x,\delta_t))^{k-1}
\]

and

\[
P(\exists x : F(x;\eta) > B) \leq E \sum_{x \in \eta} \mathbf{1}(F(x;\eta) > B) \leq \int_{\mathbb{X}} P(F(x;\eta) > B) \, d\mu_t,
\]

and it remains to estimate

\[
P\Big( \eta(B(x,\delta_t)) \geq \big( B / \|f\|_\infty \big)^{1/(k-1)} \Big).
\]
We use the Chernoff bound for the Poisson distribution, namely

\[
P\big( \eta_t(B(x,\delta_t)) \geq r \big) \leq \inf_{s \geq 0} e^{E(x)(e^s - 1) - sr}, \tag{19}
\]

because $\eta_t(B(x,\delta_t))$ is a Poisson distributed random variable with mean

\[
E(x) := E\,\eta_t(B^d(x,\delta_t)) = \mu_t(B^d(x,\delta_t)) \leq \sup_{x \in \mathbb{X}} \mu_t(B^d(x,\delta_t)) =: E. \tag{20}
\]

Because $\inf_{s \geq 0} E(e^s - 1) - sr = r(1 - \ln(r/E)) - E$ we estimate the right hand side of (19) by $\exp(-r)$ for $E e^2 \leq r$. This leads to

\[
P(\exists x : F(x;\eta) > B) \leq \mu_t(\mathbb{X}) \exp\Big( -\big( B / \|f\|_\infty \big)^{1/(k-1)} \Big) =: \varepsilon(B)
\]


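for $B \geq E^{k-1} e^{2(k-1)} \|f\|_\infty$. We set $B = \|f\|_\infty^{1/k} \big( \frac{u^2}{u+m} \big)^{(k-1)/k}$ and combine this with the general Theorem 8.

Theorem 9. Set $E := \sup_{x \in \mathbb{X}} \mu_t(B^d(x,\delta_t))$. Then for $\frac{u^2}{u+m} \geq E^k e^{2k} \|f\|_\infty$,

\[
P\big( |U(f;\eta) - m| \geq u \big) \leq 4 \mu_t(\mathbb{X}) \exp\Big( -\frac{1}{4k^2} \|f\|_\infty^{-1/k} \Big( \frac{u^2}{u+m} \Big)^{1/k} \Big).
\]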


Clearly, in particular situations more careful choices of e(B) and B lead to more 
precise bounds. 
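The Chernoff step used for the Poisson count $\eta_t(B(x,\delta_t))$ can be compared with the exact Poisson tail. The following is a minimal sketch; the function names are illustrative, and the exact tail is computed by direct summation for integer thresholds only.

```python
import math

def chernoff_exponent(r, mean):
    """Infimum over s >= 0 of mean * (e^s - 1) - s * r.

    For r > mean the infimum is attained at s = log(r / mean);
    otherwise it is attained at s = 0 and equals 0.
    """
    if r <= mean:
        return 0.0
    return r * (1.0 - math.log(r / mean)) - mean

def chernoff_bound(r, mean):
    """Upper bound on P(N >= r) for N Poisson distributed with the given mean."""
    return math.exp(chernoff_exponent(r, mean))

def poisson_tail(r, mean):
    """Exact P(N >= r) for a non-negative integer r."""
    term, cdf = math.exp(-mean), 0.0
    for n in range(r):
        cdf += term
        term *= mean / (n + 1)
    return max(0.0, 1.0 - cdf)

if __name__ == "__main__":
    mean = 2.0
    for r in (5, 10, 15):
        print(r, poisson_tail(r, mean), chernoff_bound(r, mean), math.exp(-r))
```

Once $r \geq E e^2$, the logarithm in the exponent is at least 2, so the Chernoff bound drops below $\exp(-r)$, which is the simplification used in the text.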









4 Applications 

In this section we investigate some applications of the previous theorems in stochastic geometry. In all these cases $\mathbb{X}$ is either a subset of $\mathbb{R}^d$ or a subset of the affine Grassmannian $A_i^d$, the space of all $i$-dimensional affine subspaces of $\mathbb{R}^d$.

We state some normal approximation and concentration results which follow from the previous theorems. In many cases multi-dimensional convergence and convergence to other limit distributions can be proved in various regimes. We restrict our presentation to certain 'simple' cases without making any attempt at completeness. Our aim is just to indicate recent trends; we refer to further results and investigations in the literature.


4.1 Intersection process 

Let $\eta_t$ be a Poisson process on the space $A_i^d$ with an intensity measure of the form $\mu_t(\cdot) = t\theta(\cdot)$ with $t \in \mathbb{R}_+$ and a $\sigma$-finite non-atomic measure $\theta$. The Poisson flat process is only observed in a compact convex window $W \subset \mathbb{R}^d$ with interior points. Thus, we can view $\eta_t$ as a Poisson process on the set $\mathbb{X} = [W]$ defined by

\[
[W] = \{ h \in A_i^d : h \cap W \neq \emptyset \}.
\]

Given the hyperplane process $\eta_t$, we investigate the $(d - k(d-i))$-flats in $W$ which occur as the intersection of $k$ planes of $\eta_t$. Hence we assume $k \leq d/(d-i)$. In particular, we are interested in the sum of their $j$-th intrinsic volumes given by

\[
\Phi_t = \Phi_t(W,i,k,j) = \frac{1}{k!} \sum_{(h_1,\dots,h_k) \in \eta^k_{t,\neq}} V_j(h_1 \cap \dots \cap h_k \cap W)
\]

for $j = 0,\dots,d - k(d-i)$, $i = 0,\dots,d-1$ and $k = 1,\dots,\lfloor d/(d-i) \rfloor$. For the definition of the $j$-th intrinsic volume $V_j(\cdot)$ we refer to Chapter 2 of the current book. We remark that $V_0(K)$ is the Euler characteristic of the set $K$, and that $V_n(K)$ of an $n$-dimensional convex set $K$ is the Lebesgue measure $\ell_n(K)$. Thus $\Phi_t(W,i,1,0)$ is the number of flats in $W$ and $\Phi_t(W,i,k,d - k(d-i))$ is the $(d - k(d-i))$-volume of their intersection process. To ensure that the expectations of these random variables are neither 0 nor infinite, we assume that $0 < \theta([W]) < \infty$, and that $2 \leq k \leq \lfloor d/(d-i) \rfloor$ independent random hyperplanes on $[W]$ with probability measure $\theta(\cdot)/\theta([W])$ intersect in a $(d - k(d-i))$-flat almost surely and their intersection flat hits the interior of $W$ with positive probability. For example, these conditions are satisfied if the hyperplane process is stationary and the directional distribution is not concentrated on a great subsphere.

The fact that the summands in the definition of $\Phi_t$ are bounded and have a bounded support makes sure that all moment conditions are satisfied and we can apply Theorem 5.

Theorem 10. Let $N$ be a standard Gaussian random variable and let $\tilde{\Phi}_t = (\Phi_t - E\,\Phi_t)/\sqrt{\mathrm{Var}\,\Phi_t}$. Then constants $c = c(W,i,k,j)$ exist such that

\[
d_W(\tilde{\Phi}_t, N) \leq c\,t^{-1/2},
\]

\[
d_K(\tilde{\Phi}_t, N) \leq c\,t^{-1/2},
\]

for $t \geq 1$.

Furthermore, it can be shown [?] that the asymptotic variance satisfies $\mathrm{Var}\,\Phi_t = C_\Phi\, t^{2k-1}(1 + o(1))$ as $t \to \infty$ with a constant $C_\Phi = C_\Phi(W,i,k,j)$. The order of magnitude already follows from the first part of Theorem 5.

For more information we refer to [?] and [?]. In the second paper the Wiener-Itô chaos expansion is used to derive even multivariate central limit theorems in an increasing window for much more general functionals $\Phi$.


4.2 Flat processes 

For $i < d/2$ two $i$-dimensional planes in general position will not intersect. Thus the
intersection process described in the previous section will be empty with probability
one. A natural way to investigate the geometric situation in this setting is to ask
for the distances between these $i$-dimensional planes, or more generally for the
so-called proximity functional. The central limit theorem described in the following
fits precisely into the setting of this contribution; we refer to [?] for further results.

Let $\eta_t$ be a Poisson process on the space $A(d,i)$ with an intensity measure of the
form $\mu_t(\cdot) = t\theta(\cdot)$ with $t \in \mathbb{R}_+$ and a $\sigma$-finite non-atomic measure $\theta$. The Poisson
flat process is observed in a compact convex window $W \subset \mathbb{R}^d$. To two $i$-dimensional
planes $h_1, h_2$ in general position there is a unique segment $[x_1, x_2]$ with

$$d(h_1,h_2) = \|x_2 - x_1\| = \min_{y \in h_1,\, z \in h_2} \|z - y\|.$$

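The midpoints $m(h_1,h_2) = \frac{1}{2}(x_1 + x_2)$ form a point process of infinite intensity,
hence we restrict this to the point process

$$\{ m(h_1,h_2) : d(h_1,h_2) \le \delta,\ (h_1,h_2) \in \eta_{t,\neq}^2 \}$$

and are interested in the number of midpoints in $W$,

$$\Pi_t = \Pi_t(W,\delta) = \frac{1}{2} \sum_{(h_1,h_2) \in \eta_{t,\neq}^2} \mathbb{1}\big(d(h_1,h_2) \le \delta,\ m(h_1,h_2) \in W\big).$$
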
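
For lines ($i = 1$) in $\mathbb{R}^3$ the distance, the midpoint and the count of midpoints defined above can be computed explicitly. The following Python sketch is purely illustrative and is not taken from the cited work: it assumes that each line is stored as a (point, direction) pair, and the helper names `closest_segment` and `proximity_count` as well as the window predicate are hypothetical. It finds the closest points by least squares and counts the unordered pairs with distance at most $\delta$ whose midpoint lies in the window, which equals one half of the sum over ordered pairs.

```python
import itertools
import numpy as np


def closest_segment(p1, u1, p2, u2):
    """Distance and midpoint of the shortest segment between two lines.

    Each line is {p + s * u : s real}. The lines are assumed to be
    non-parallel (general position), so the closest points are unique.
    """
    p1, u1, p2, u2 = (np.asarray(v, dtype=float) for v in (p1, u1, p2, u2))
    # Minimise ||(p1 + s*u1) - (p2 + r*u2)|| over (s, r) by least squares.
    A = np.column_stack((u1, -u2))
    (s, r), *_ = np.linalg.lstsq(A, p2 - p1, rcond=None)
    x1 = p1 + s * u1
    x2 = p2 + r * u2
    return float(np.linalg.norm(x2 - x1)), 0.5 * (x1 + x2)


def proximity_count(lines, delta, in_window):
    """Number of unordered pairs of lines with distance <= delta
    whose midpoint satisfies the predicate in_window."""
    count = 0
    for (p1, u1), (p2, u2) in itertools.combinations(lines, 2):
        dist, mid = closest_segment(p1, u1, p2, u2)
        if dist <= delta and in_window(mid):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical window: the unit cube [0, 1]^3.
    unit_cube = lambda z: bool(np.all((z >= 0.0) & (z <= 1.0)))
    lines = [
        ((0.5, 0.5, 0.4), (1.0, 0.0, 0.0)),
        ((0.5, 0.5, 0.6), (0.0, 1.0, 0.0)),
        ((0.5, 0.5, 5.0), (1.0, 1.0, 0.0)),
    ]
    print(proximity_count(lines, delta=0.25, in_window=unit_cube))
```

The brute-force loop over all pairs is quadratic in the number of lines, which is adequate for small illustrations only.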

It is not difficult to show that $\mathbb{E}\Pi_t$ is of order $t^2 \delta^{d-2i}$. The U-statistic $\Pi_t$ is local on
the space $A(d,i)$. Thus the following theorem due to Schulte and Thäle [?] is in spirit
similar to Theorem 4.

Theorem 11. Let $N$ be a standard Gaussian random variable. Then constants
$c(W,i)$ exist such that

$$d_K(\tilde{\Pi}_t, N) \le c(W,i)\, t^{-\frac{d-i}{2}}$$

for $t \ge 1$.

Moreover, Schulte and Thäle proved that, after suitable rescaling, the ordered
distances asymptotically form an inhomogeneous Poisson point process on the positive
real axis.

We add to this a concentration inequality which follows immediately from
Theorem 9. Observe that $\mu_t(X) = t\,\theta([W])$.

Theorem 12. Denote by $m$ the median of $\Pi_t$. Then




for 


4.3 Gilbert graph 

Let $\eta_t$ be a Poisson point process on $\mathbb{R}^d$ with an intensity measure of the form
$\mu_t(\cdot) = t\,\ell_d(\cdot \cap W)$, where $\ell_d$ is Lebesgue measure and $W \subset \mathbb{R}^d$ a compact convex
set with $\ell_d(W) = 1$. Let $(\delta_t : t > 0)$ be a sequence of positive real numbers such that
$\delta_t \to 0$ as $t \to \infty$. The random geometric graph is defined by taking the points of $\eta_t$
as vertices and by connecting two distinct points $x, y \in \eta_t$ by an edge if and only if
$\|x - y\| \le \delta_t$. The resulting graph is called the Gilbert graph.

There is a vast literature on the Gilbert graph; the standard reference is
Penrose's seminal book [?]. More recent developments are due to Bourguin and
Peccati [?], Lachièze-Rey and Peccati [?, ?] and Reitzner, Schulte and Thäle [?].

In a first step one is interested in the number of edges 


$$N_t = N_t(W,\delta_t) = \frac{1}{2} \sum_{(x,y) \in \eta_{t,\neq}^2} \mathbb{1}(\|x - y\| \le \delta_t)$$



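of this random geometric graph. It is natural to replace the norm by a more general
function $f$, that is, to consider the indicators $\mathbb{1}(f(y - x) \le \delta_t)$, and to replace
counting by summing a more general function $g(y - x)$:

$$\frac{1}{2} \sum_{(x,y) \in \eta_{t,\neq}^2} \mathbb{1}(f(y - x) \le \delta_t)\, g(y - x).$$
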
For simplicity we restrict our investigations in this survey to the number of edges $N_t$
in the thermodynamic setting where $t\delta_t^d$ tends to a constant as $t \to \infty$. Further results
for other regimes, multivariate limit theorems and sharper concentration inequalities
can be found in Penrose's book and the papers mentioned above.
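
The role of the thermodynamic scaling can be seen from a first-moment computation. The following sketch is an aside and not part of the original argument: it assumes the multivariate Mecke formula for Poisson U-statistics and writes $\kappa_d$ for the volume of the unit ball in $\mathbb{R}^d$ (a symbol introduced here for illustration); boundary effects near $\partial W$ are absorbed into the $o(1)$ term.

```latex
\begin{align*}
\mathbb{E} N_t
  &= \frac{t^2}{2} \int_W \int_W \mathbb{1}(\|x - y\| \le \delta_t) \, dx \, dy \\
  &= \frac{\kappa_d}{2} \, t^2 \delta_t^d \, (1 + o(1)), \qquad t \to \infty .
\end{align*}
```

Hence, if $t\delta_t^d$ tends to a positive constant, the expected number of edges grows linearly in $t$, that is, at the same rate as the expected number of vertices.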

Because of the local definition of the Gilbert graph, $N_t$ is a local U-statistic.
Theorem 6 with $v_t = t\delta_t^d$ can be applied.
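
The edge count is easy to simulate. The following Python sketch is only an illustration under stated assumptions: the window is taken to be the unit cube $[0,1]^d$ (a convenient convex set of volume one), the thermodynamic choice $\delta_t = (c/t)^{1/d}$ with a constant `c` is one hypothetical example of a sequence with $t\delta_t^d \to c$, and the function names are not from the original text.

```python
import numpy as np


def sample_poisson_points(t, d, rng):
    """Poisson point process with intensity t * Lebesgue on [0, 1]^d.

    The number of points is Poisson(t); given that number, the points
    are independent and uniform on the (assumed) window [0, 1]^d.
    """
    n = rng.poisson(t)
    return rng.random((n, d))


def count_edges(points, delta):
    """Number of unordered pairs {x, y} with ||x - y|| <= delta."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < 2:
        return 0
    # Full pairwise distance matrix: O(n^2) memory, fine for small n.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(n, k=1)  # each unordered pair once
    return int(np.count_nonzero(dist[i, j] <= delta))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, c = 2, 1.0  # hypothetical dimension and limiting constant
    for t in (100, 200, 400):
        delta_t = (c / t) ** (1.0 / d)  # so that t * delta_t**d == c
        n_edges = count_edges(sample_poisson_points(t, d, rng), delta_t)
        print(t, n_edges, n_edges / t)
```

Counting each unordered pair once corresponds to the factor $\frac{1}{2}$ in front of the sum over ordered pairs of distinct points.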

Theorem 13. Let $N$ be a standard Gaussian random variable. Then constants $c(W)$
exist such that

$$d_W(\tilde{N}_t, N) \le c(W)\, t^{-1/2},$$
$$d_K(\tilde{N}_t, N) \le c(W)\, t^{-1/2},$$

for $t \ge 1$.

A concentration inequality follows immediately from Theorem 9. Observe that

$$\mu_t(X) = t\,\ell_d(W).$$

Theorem 14. Denote by $m$ the median of $N_t$. Then there is a constant $c_d$ such that
F(\n t -m\ >u)< 4flrf(W)exp (—-rr 7 =— ) 

\ it) y/II+ 111/ 


for 


> Cd- 


In [?] a concentration inequality for all u > 0 is given using a similar but more 
detailed approach. 


4.4 Random simplicial complexes 

Given the Gilbert graph of a Poisson point process $\eta_t$, we construct the Vietoris-Rips
complex $R(\delta_t)$ by calling $F = \{x_{i_1}, \dots, x_{i_{k+1}}\}$ a $k$-face of $R(\delta_t)$ if all pairs of points
in $F$ are connected by an edge in the Gilbert graph. Observe that e.g. counting the
number $N_t^{(k)}$ of $k$-faces is equivalent to a particular subgraph counting. By definition
this is a local U-statistic given by

$$N_t^{(k)} = N_t^{(k)}(W,\delta_t) = \frac{1}{(k+1)!} \sum_{(x_1,\dots,x_{k+1}) \in \eta_{t,\neq}^{k+1}} \mathbb{1}\big(\|x_i - x_j\| \le \delta_t,\ \forall\, 1 \le i,j \le k+1\big).$$
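
The count of $k$-faces is a clique count in the Gilbert graph. The following brute-force Python sketch is an illustration only (the function name `count_k_faces` is hypothetical and the approach is not taken from the cited literature): it enumerates all subsets of $k+1$ points and keeps those whose points are pairwise within distance $\delta$. Enumerating unordered subsets accounts for the factor $1/(k+1)!$ in front of the sum over ordered tuples.

```python
import itertools
import numpy as np


def count_k_faces(points, delta, k):
    """Number of k-faces of the Vietoris-Rips complex at scale delta.

    A k-face is a set of k + 1 points that are pairwise at distance
    <= delta, i.e. a (k + 1)-clique of the Gilbert graph.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < k + 1:
        return 0
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adjacent = dist <= delta
    count = 0
    # Brute force over all (k + 1)-subsets; only sensible for small n.
    for subset in itertools.combinations(range(n), k + 1):
        if all(adjacent[i, j] for i, j in itertools.combinations(subset, 2)):
            count += 1
    return count


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    print(count_k_faces(pts, delta=1.5, k=1))  # edges
    print(count_k_faces(pts, delta=1.5, k=2))  # triangles
```

For $k = 1$ the function returns the number of edges of the Gilbert graph.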

Central limit theorems and a concentration inequality follow immediately from the
results for local U-statistics. We restrict our statements again to the thermodynamic
case where $t\delta_t^d$ tends to a constant as $t \to \infty$. Results for other regimes can be found
e.g. in Penrose's book. Because of the local definition of the Gilbert graph, $N_t^{(k)}$ is a
local U-statistic. Theorem 6 with $v_t = t\delta_t^d$ can be applied.

Theorem 15. Let $N$ be a standard Gaussian random variable. Then constants $c(W)$
exist such that

$$d_W(\tilde{N}_t^{(k)}, N) \le c(W)\, t^{-1/2},$$
$$d_K(\tilde{N}_t^{(k)}, N) \le c(W)\, t^{-1/2},$$

for $t \ge 1$.

A concentration inequality follows immediately from Theorem 9. Observe that

$$\mu_t(X) = t\,\ell_d(W).$$

Theorem 16. Denote by $m$ the median of $N_t^{(k)}$. Then




Much deeper results concerning the topology of random simplicial complexes
are contained in [?, ?] and [?]. We refer the interested reader to the recent survey
article by Kahle [?].


4.5 Sylvester’s constant 

Again we assume that the Poisson point process $\eta_t$ has an intensity measure of the
form $\mu_t(\cdot) = t\,\ell_d(\cdot \cap W)$, where $\ell_d$ is Lebesgue measure and $W \subset \mathbb{R}^d$ a compact
convex set with $\ell_d(W) = 1$.

As a last example of a U-statistic we consider the following functional related
to Sylvester's problem. Originally posed in 1864 for $k = 4$, Sylvester's
problem asks for the distribution of the number of vertices of the convex hull of
four random points. Put

$$N_t = N_t(W,k) = \sum_{(x_1,\dots,x_k) \in \eta_{t,\neq}^k} \mathbb{1}\big(x_1,\dots,x_k \text{ are vertices of } \operatorname{conv}(x_1,\dots,x_k)\big),$$

which counts the number of $k$-tuples of the process such that every point is a vertex
of the convex hull, i.e., the number of $k$-tuples in convex position.

The expected value of $N_t$ is then given by

$$\mathbb{E} N_t = t^k\, \mathbb{P}\big(X_1,\dots,X_k \text{ are vertices of } \operatorname{conv}(X_1,\dots,X_k)\big) = t^k p(W,k),$$

where $X_1,\dots,X_k$ are independent random points chosen according to the uniform
distribution on $W$.

The problem of determining the probability $p(W,k)$ that $k$ random points in a convex
set $W$ are in convex position has a long history, see e.g. the more recent
development by Bárány [?]. In our setting, the function $t^{-k} N_t$ is an estimator for the
probability $p(W,k)$ and we are interested in its distributional properties.
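
The estimator can be illustrated numerically in the plane. The following Python sketch is an assumption-laden illustration, not the authors' method: it takes $W$ to be the unit square, tests convex position with a standard monotone-chain convex hull (a technique chosen here for convenience), and counts ordered $k$-tuples of distinct points so that the result matches the sum over $\eta_{t,\neq}^k$. All function names are hypothetical.

```python
import itertools
import math
import numpy as np


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points):
    """Vertices of the planar convex hull (monotone chain).

    Points lying on an edge of the hull are not reported as vertices.
    """
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def count_convex_position(points, k):
    """Number of ordered k-tuples of distinct points in convex position."""
    subsets = sum(
        1
        for s in itertools.combinations(points, k)
        if len(hull_vertices(s)) == k
    )
    # Every unordered k-subset corresponds to k! ordered tuples.
    return math.factorial(k) * subsets


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t, k = 20, 4  # small hypothetical intensity: the loop is O(n^k)
    points = rng.random((rng.poisson(t), 2))  # W = [0, 1]^2 (assumed)
    n_t = count_convex_position(points, k)
    print("estimate of p(W, k):", n_t / t**k)
```

A single realisation with small $t$ gives only a rough estimate; averaging over independent realisations or increasing $t$ reduces the fluctuation, at the price of the $O(n^k)$ enumeration.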

The variance $\operatorname{Var}(N_t)$ is asymptotically of order $t^{2k-1}$. Together with Theorem 3
we immediately get the following result showing that the estimator $t^{-k} N_t$ is
asymptotically Gaussian:

Theorem 17. Let $N$ be a standard Gaussian random variable. Then there exists a
constant $c(W,k)$ such that

$$d_W(\tilde{N}_t, N) \le c(W,k)\, t^{-1/2}.$$